Abstract

More and more low-level vision algorithms are being carried out in the spatial
frequency domain using Gabor filters. This paper addresses two basic problems of
Gabor filtering. The first is the window size problem: we adopt a set of 2D variable
window Gabor filters and compare its performance with that of fixed window filters.
We show that the variable window scheme adapts better to image content, while fixed
window schemes may suffer from either large errors or instabilities when the image
content changes. The second problem is the stability of the amplitude and phase
information obtained by convolving the filters with images. We extend Fleet's 1D
phase stability analysis to a 2D phase and amplitude stability analysis, based on the
assumption that filter outputs locally resemble a single sinusoid. Applications to
focus quality measurement and 2D correspondence are described, and the results
demonstrate that performance improves when unstable information is detected with the
criterion developed here.

Keywords: Computer vision, low-level processing, Gabor filter, depth from defocus,
depth from stereo, 2D correspondence.

1 Introduction


This paper addresses two problems arising in the use of Gabor filters to extract
information from images. The first is the window size problem, for which we use a set
of variable window Gabor filters to extract information at each frequency band. The
benefit of this set of filters is that we no longer need to adjust the window size
parameter for every experiment. The second is the stability problem caused by finite
windows. We extend Fleet's 1D phase stability analysis to a 2D phase and amplitude
stability analysis, and develop a more general stability constraint. Finally, we
demonstrate the use of these tools on two important vision problems, focus and
correspondence.

In traditional approaches, the window size of low-level operators is tuned manually
for each experiment. Such tuning is undesirable if an algorithm is to be flexible and
stable. Intuitively, extracting low frequency information requires large windows for
the information to be stable, while for higher frequency components smaller windows
are preferred in order to preserve locality. This suggests a variable window scheme,
which decomposes a fixed window into a set of windows whose sizes are directly related
to their frequencies. We use a set of self-similar, rotation and translation invariant
2D Gabor filters, which covers the whole spatial frequency plane up to the Nyquist
frequency, to extract information from images. By combining results from different
frequency bands, a focus algorithm based on this decomposition demonstrates greater
flexibility and adaptiveness than those based on a fixed window size.

It is well known that windowing in the spatial domain is equivalent to convolution
in the frequency domain, which limits the overall frequency resolution. When amplitude
and phase information are extracted by windowed filtering, they may be severely
contaminated by this convolution. Therefore, any algorithm that uses that information,
implicitly or explicitly, without further examining its stability is subject to either
large errors or total failure. One goal of this paper is to provide a set of
constraints capable of identifying the contaminated information, together with a
generic framework for using the information.

Focus quality measurement is a typical problem that uses Fourier amplitude
information. The problem can be formulated as measuring the change of amplitude between
two images. By applying the set of 2D variable window Gabor filters and the technique
for identifying unstable amplitude information, we show that the stability,
adaptiveness and precision of the algorithm can be improved.

2D matching is another typical problem, which can be formulated using phase
information. Because a shift in the spatial domain is equivalent to a phase change
in the frequency domain, we can measure phase changes to infer the spatial shift
between two images. As in focus quality measurement, we apply the set of filters and
the stability analysis to this problem and show that the performance of the algorithm
can be improved.

It is worth noting that the generic framework for using information extracted from
the set of filters is not limited to any specific application. While this paper
demonstrates its application to focus quality measurement and 2D intensity image
matching, the framework can be extended to any vision algorithm that makes use of
either amplitude or phase information.

2  Related  Research 

As an alternative to performing visual computing in the spatial domain, the spatial
frequency approach has been favored by many researchers because of the applicability of
various signal processing techniques [12] and because of biological evidence [4].
Previous research on visual computing in the frequency domain has concentrated on four
areas: motion analysis, stereo matching, texture analysis, and focus measurement.

Adelson and Bergen [1] and Heeger [9] modeled motion in 2D image space as
orientation in 3D spatiotemporal space, and therefore introduced 3D oriented filters to
measure image velocities. More recently, Fleet and Jepson [6] modeled the normal
velocity as a function of local phase changes, and used Gabor filters to measure
changes of phase at every pixel location. For the stereo matching problem, Weng [20],
Sanger [18], Fleet et al. [7], and Langley et al. [15] proposed using filters to extract
phase information and then computing disparities from it, while Jones and Malik [11]
applied a set of linear spatial filters to images and used the responses of those
filters as matching features. The spatial frequency approach has also achieved great
success in texture segmentation [3, 10, 14] and in shape recovery from texture [13, 16].

One of the major disadvantages of the spatial frequency approach is the artifact
introduced by windowing, which may cause substantial errors if unnoticed. The usual
way to overcome this problem has been to use large windows so that the artifact is
negligible, but at the price of severely reduced resolution. Fleet and Jepson [5]
provided an excellent way to analyze the stability of phase information. One of the
major goals of this paper is to generalize their work to a stability analysis of both
phase and amplitude information.

Measuring focus quality through spatial frequency analysis has been proposed in the
literature [22, 2, 19, 17]. Few of the reported results have addressed the stability of
amplitude information, which is the only information used in focus quality
measurement. We show that the stability analysis is capable of eliminating unstable
amplitude information, thereby improving the performance of focus quality measurement.

3  Variable  Window  Gabor  Filters 

Limited by the uncertainty principle [8], any filter must compromise between spatial
resolution $\Delta x$ and spatial frequency resolution $\Delta f$.¹ The Gabor filter,
which is a complex sinusoid modulated by a Gaussian function, is the one that achieves
the minimal product of spatial uncertainty and frequency uncertainty:

\[ G(x, f; \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}\, e^{jfx}, \tag{1} \]

where the spatial extent $\sigma$ determines the tradeoff between the spatial resolution
and the frequency resolution.

¹In this paper, $f$ always refers to an angular frequency, i.e. the wavelength is
$\lambda = 2\pi / f$.

If the spatial extent $\sigma$ is constant across different frequency bands, i.e. if we
compute the spectrogram [22], the filters for low frequency bands contain far fewer
waves than those for high frequency bands. Spatial localization is therefore
effectively reduced at high frequencies, while spectral localization in terms of
octaves (logarithmic frequency bands) is effectively reduced at low frequencies.
Suppose instead that the spatial extent $\sigma$ varies linearly with the wavelength of
the tuned frequency, i.e.,

\[ \sigma = \frac{k_1}{f}; \tag{2} \]

we then obtain a set of variable window Gabor filters:

\[ G(x, f) = \frac{f}{\sqrt{2\pi}\,k_1}\, e^{-\frac{f^2 x^2}{2 k_1^2}}\, e^{jfx}. \tag{3} \]
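As a rough illustration of the variable window idea, the following Python sketch builds 1D Gabor filters whose spatial extent shrinks as the tuned frequency grows. It assumes the proportionality $\sigma = k_1 / f$, truncates each kernel at three $\sigma$, and relies on NumPy; the function names, the default $k_1 = \pi$ and the truncation width are illustrative choices rather than details confirmed by the text.

```python
import numpy as np

def gabor_1d(x, f, sigma):
    """Complex 1D Gabor filter: normalized Gaussian envelope times exp(j f x)."""
    envelope = np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    return envelope * np.exp(1j * f * x)

def variable_window_gabor_1d(f, k1=np.pi, n_sigma=3.0):
    """Gabor filter whose extent is tied to the tuned angular frequency f.

    ASSUMPTION: sigma = k1 / f, so the window always spans the same number
    of wavelengths. The kernel is truncated at n_sigma * sigma samples.
    """
    sigma = k1 / f
    half = int(np.ceil(n_sigma * sigma))
    x = np.arange(-half, half + 1, dtype=float)
    return x, gabor_1d(x, f, sigma), sigma

if __name__ == "__main__":
    for f in (0.2, 0.4, 0.8):
        x, g, sigma = variable_window_gabor_1d(f)
        print(f"f={f:.1f}  sigma={sigma:.2f}  kernel length={len(g)}")
```

Halving the tuned frequency doubles the window, which is the behavior the variable window scheme relies on.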


Extending the 1D variable window Gabor filters to 2D, we have the 2D Gabor filter:

\[ G(x, y, u, v) = g(x, y)\, e^{j(ux + vy)}, \tag{4} \]

where $g(x, y)$ is, in general, an elliptical 2D Gaussian function and $(u, v)$ is the
2D peak frequency of the filter.

The radial frequency of this filter is $f = \sqrt{u^2 + v^2}$ and its orientation is
$\theta = \tan^{-1}(v/u)$. It is usually more convenient to give the modulating
elliptical Gaussian the same orientation $\theta$ as the filter. Figure 1 illustrates
the peak frequency position, the spectral extent of one filter in the frequency domain,
and the real part of the filter in the spatial domain. A remaining free parameter is
the aspect ratio of the elliptical Gaussian. Inspired by the biological evidence
reported in [4], we choose an aspect ratio of two-thirds.

As for the 1D Gabor filter, we constrain the spatial extent of the 2D filter in the
$\theta$ direction to be proportional to the radial wavelength. We then obtain a set of
2D Gabor filters that are translated, rotated, and dilated or contracted versions of
each other. In the experiments below, we used a set of 120 2D filters with 10
different radial frequencies and 12 orientations. The constant $k_1$ in Eq. 3 is set to
$\pi$. The spectral extents of the filters are shown in Figure 2, assuming the extent of
$g(x; \sigma)$ is from $-\sigma$ to $\sigma$.
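A single oriented 2D filter of this family could be sketched in Python as follows. The sketch assumes $\sigma = k_1 / f$ along the orientation $\theta$ and, as a guess at how the two-thirds aspect ratio is applied, an extent of $\sigma / k_2$ in the perpendicular direction; the text does not spell out which axis the ratio scales. The grid size and function name are likewise illustrative.

```python
import numpy as np

def gabor_2d(u, v, k1=np.pi, k2=2.0 / 3.0, half=32):
    """Oriented complex 2D Gabor kernel with peak frequency (u, v).

    ASSUMPTIONS (not confirmed by the paper):
      * extent along the orientation theta is sigma = k1 / f,
      * extent across it is sigma / k2 (one possible reading of the
        aspect ratio),
      * the kernel is sampled on a (2*half+1) x (2*half+1) grid.
    """
    f = np.hypot(u, v)                    # radial frequency
    theta = np.arctan2(v, u)              # filter orientation
    sigma = k1 / f
    sigma_perp = sigma / k2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so that xr runs along the filter orientation.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 / (2.0 * sigma**2) + yr**2 / (2.0 * sigma_perp**2)))
    envelope /= 2.0 * np.pi * sigma * sigma_perp
    return envelope * np.exp(1j * (u * x + v * y))

if __name__ == "__main__":
    kernel = gabor_2d(0.8, 0.0)
    spectrum = np.abs(np.fft.fft2(kernel))
    iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    freqs = 2.0 * np.pi * np.fft.fftfreq(kernel.shape[0])
    print("spectral peak near (u, v) =", (freqs[ix], freqs[iy]))
```

Because every filter is derived from $(u, v)$ alone, a bank of radial frequencies and orientations can be generated by looping over peak frequencies.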

As we will show in the experiments, the benefit of using a set of filters with
different sizes of spatial support appears when results from different frequency bands
are combined. If the image has strong high frequency components, the final result is
strongly influenced by results from the high frequency bands and is therefore based on
very local information. Conversely, if the image contains strong low frequency
components, the final result is based on less local information. We no longer need to
tune the window size every time new images are processed.

Figure 1: Oriented 2D Gabor Filter in Frequency and Spatial Domains

Figure 2: The Set of 2D Variable Window Gabor Filters

4  Stability  of  Filtering 

Though  we  obtain  an  amplitude  value  and  a  phase  value  at  every  pixel  location  by 
convolving  the  image  with  one  filter,  both  of  them  may  be  so  severely  contaminated 
by  either  windowing  or  noise  that  it  is  no  longer  valid  to  use  them  as  approximations 
of  the  real  amplitude  and  phase  values.  Therefore,  any  algorithm  using  them  without 
discretion  will  potentially  result  in  substantial  errors.  The  goal  of  this  section  is  to 
provide  a  way  to  quantify  such  a  bias  from  real  values  and,  subsequently,  reduce  errors 
caused  by  those  contaminations. 

4.1  Window  Contamination 

The window contamination originates from the convolution in the spatial frequency
domain caused by window multiplication in the spatial domain. In other words, we are
extracting amplitude and phase values not at a single frequency, but from a weighted
sum over a band of frequencies. The behavior of the sum may or may not be similar to
that of a single frequency.

For simplicity, let us first consider the 1D case. For the sum of different frequency
components to behave like a single frequency component, we require the local behavior
of the sum to satisfy the following criteria:

1.  The  phase  should  change  linearly  w.r.t.  the  position,  i.e.  the  derivative  of  the 
phase  w.r.t.  the  pixel  position  should  be  the  frequency. 

2.  The  phase  should  be  stable  w.r.t.  the  tuning  frequency,  i.e.  the  phase  should  be 
constant  when  the  tuning  frequency  is  shifted  slightly. 

3.  The  amplitude  should  be  stable  w.r.t.  the  position,  i.e.  the  derivative  of  ampli¬ 
tude  w.r.t.  the  pixel  position  should  be  zero. 

4.  The  amplitude  should  be  stable  w.r.t.  the  tuning  frequency,  i.e.  the  amplitude 
should  be  constant  when  the  tuning  frequency  is  shifted  slightly. 

Clearly, a complex sinusoid satisfies all the criteria perfectly². Therefore, for
the sum to approximate a single sinusoid locally, it has to approximately satisfy those
criteria locally. Conversely, if the sum does satisfy all the constraints, then the
bandpassed signal can be well approximated locally by a sinusoid in the spatial domain.
Consequently, we regard locally unstable signals resulting from bandpass filtering as
those which cannot be well approximated by a single sinusoid.

²For criterion 4, the amplitude is constant only when the tuned frequency is
approximately equal to the frequency of the sinusoid.


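The remaining task is to find one or more easily computed constraints that indicate
to what extent those criteria are satisfied. Let us further simplify the problem by
assuming that the signal is a sum of two independent sinusoids of frequencies $f_0$ and
$f_1$, with weighted amplitudes $a_0$ and $a_1$. The sum is then

\[ a e^{j\phi} = a_0 e^{j\phi_0} + a_1 e^{j\phi_1}. \tag{5} \]

Let $\lambda = a_1 / a_0$, $\Delta\phi = \phi_1 - \phi_0$ and $\Delta f = f_1 - f_0$. After
some manipulation, criteria 1 and 3 can be expressed as:

\[ T_1 = \frac{d\phi}{dx} - f_0 = \frac{\lambda\,\Delta f\,(\lambda + \cos\Delta\phi)}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}, \tag{6} \]

\[ T_3 = \frac{1}{a}\frac{da}{dx} = \frac{\lambda\,\Delta f\,\sin\Delta\phi}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}. \tag{7} \]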
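The closed forms for criteria 1 and 3 can be checked numerically. The short Python sketch below compares them with finite-difference derivatives of the phase and log-amplitude of a two-sinusoid sum. The amplitude term is compared in magnitude only, because only its square enters the overall criterion. All names and the test values are illustrative assumptions.

```python
import numpy as np

def t1_t3_closed_form(lam, dphi, df):
    """Closed-form criteria 1 and 3 for a sum of two sinusoids."""
    den = lam**2 + 2.0 * lam * np.cos(dphi) + 1.0
    t1 = lam * df * (lam + np.cos(dphi)) / den
    t3 = lam * df * np.sin(dphi) / den
    return t1, t3

def t1_t3_numeric(x, f0, f1, a0, a1, c, h=1e-5):
    """Finite-difference estimates of d(phi)/dx - f0 and (1/a) da/dx."""
    def s(t):
        # Two complex sinusoids; c is the phase offset of the second one.
        return a0 * np.exp(1j * f0 * t) + a1 * np.exp(1j * (f1 * t + c))
    dphi_dx = np.angle(s(x + h) * np.conj(s(x - h))) / (2.0 * h)
    dloga_dx = (np.log(abs(s(x + h))) - np.log(abs(s(x - h)))) / (2.0 * h)
    return dphi_dx - f0, dloga_dx

if __name__ == "__main__":
    x, f0, f1, a0, a1, c = 2.0, 0.5, 0.6, 1.0, 0.7, 0.3
    closed = t1_t3_closed_form(a1 / a0, (f1 - f0) * x + c, f1 - f0)
    numeric = t1_t3_numeric(x, f0, f1, a0, a1, c)
    print("closed form:", closed)
    print("numeric    :", numeric)
```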


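When the tuning frequency is shifted by $df$, as in Figure 3, the amplitude ratio of
the two components changes:

\[ \frac{d\lambda}{df} = \frac{1}{df}\left[\lambda\,\frac{e^{-(\Delta f - df)^2\sigma^2/2}}{e^{-(\Delta f)^2\sigma^2/2}\; e^{-(df)^2\sigma^2/2}} - \lambda\right] = \frac{\lambda\left(e^{(\Delta f)(df)\sigma^2} - 1\right)}{df} = \lambda\,\Delta f\,\sigma^2, \tag{8} \]

where $\sigma$ is the spatial extent of the filter.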
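A quick numerical check of this relation is sketched below in Python. It assumes the filter's frequency response is the Gaussian $e^{-(f - f_{tune})^2\sigma^2/2}$ (the Fourier transform of the envelope in Eq. 1), treats `a0` and `a1` as the unweighted component amplitudes, and differentiates the weighted amplitude ratio with respect to the tuning frequency by central differences. Function and variable names are illustrative.

```python
import numpy as np

def amplitude_ratio(a0, a1, f0, f1, f_tune, sigma):
    """Ratio of the Gaussian-weighted amplitudes of two components.

    ASSUMPTION: each component at frequency f is weighted by
    exp(-(f - f_tune)^2 * sigma^2 / 2).
    """
    w0 = np.exp(-((f0 - f_tune) ** 2) * sigma**2 / 2.0)
    w1 = np.exp(-((f1 - f_tune) ** 2) * sigma**2 / 2.0)
    return (a1 * w1) / (a0 * w0)

if __name__ == "__main__":
    a0, a1, f0, f1, sigma, h = 1.0, 0.8, 0.5, 0.6, 6.0, 1e-6
    lam = amplitude_ratio(a0, a1, f0, f1, f0, sigma)
    numeric = (amplitude_ratio(a0, a1, f0, f1, f0 + h, sigma)
               - amplitude_ratio(a0, a1, f0, f1, f0 - h, sigma)) / (2.0 * h)
    print("numeric d(lambda)/df :", numeric)
    print("lambda * df * sigma^2:", lam * (f1 - f0) * sigma**2)
```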

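Using Equation 8 and $d(\log f) = df / f$, we can express criteria 2 and 4 as:

\[ T_2 = f\,\frac{d\phi}{df} = f\,\frac{d\phi}{d\lambda}\frac{d\lambda}{df} = \frac{\sigma^2 f\,\Delta f\,\lambda \sin\Delta\phi}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}, \tag{9} \]

\[ T_4 = \frac{f}{a}\frac{da}{df} = \frac{f}{a}\frac{da}{d\lambda}\frac{d\lambda}{df} = \frac{\sigma^2 f\,\Delta f\,\lambda(\lambda + \cos\Delta\phi)}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}. \tag{10} \]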

Summing the squares of those criteria, we obtain an overall criterion for the
stability of a signal composed of two independent sinusoids:

\[ T^2 = T_1^2 + T_2^2 + T_3^2 + T_4^2 = (1 + \sigma^4 f^2)(T_1^2 + T_3^2). \tag{11} \]

Clearly, if $T^2 \approx 0$, the four criteria listed previously are approximately
satisfied. In other words, the local behavior of the sum is similar to that of a single
sinusoid when $T^2 \approx 0$.

The rightmost expression in Eq. 11 provides another benefit for the computation of
$T$: we do not actually need to compute $T_2$ or $T_4$, which usually require dense
frequency sampling because they are derivatives with respect to frequency. In practice,
the computation of $T_1$ and $T_3$ is straightforward. Note the difference between the
criterion proposed here and those in [5]. The criterion proposed here is more general
in that the two are equivalent only when Eq. 3 is satisfied, and the criterion in
Eq. 11 applies to both amplitude and phase information.

Figure 4 illustrates the stability criterion applied to the sum of a fixed frequency
sinusoid and a chirp signal, i.e. a signal whose frequency increases linearly with
pixel location, of the same magnitude ($\lambda = 1$). The upper two graphs show
$\Delta f$ and $\Delta\phi$ in Eq. 11, the lower left graph shows the signal itself in
the spatial domain, and the lower right graph shows the computed stability criterion
$T$. The spikes in the stability criterion are caused by asynchronous sinusoids, i.e.
sinusoids that cancel each other because their phase difference approaches $\pm\pi$, and
the gradual increase in the stability criterion is caused by the increasing frequency
difference.

As proved in Appendix A, Eq. 11 is valid for an arbitrary signal. Generalizing
Eq. 11 to any signal, and taking the spatial extent into consideration, we obtain the
criterion function $T'$ as in Eq. 12:

\[ T'^2 = \sigma^2 (T_1^2 + T_3^2) = \frac{k_1^2}{f_0^2}\,(T_1^2 + T_3^2), \tag{12} \]

where $k_1$ is the constant in Eq. 3, and $f_0$ is the tuning frequency.

Elimination  of  unstable  information  can  be  done  by  simple  thresholding,  i.e.  if 
at  any  location  and  any  frequency  band,  T'  exceeds  a  certain  threshold,  then  the 
amplitude  and  phase  information  in  that  frequency  band  is  regarded  as  unstable. 
Notice  that  the  threshold  should  be  a  constant  with  respect  to  different  frequency 
bands. 
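The thresholding step can be sketched in Python on a 1D signal. The sketch assumes the criterion takes the form $T' = \sigma\sqrt{T_1^2 + T_3^2}$ (the 1D counterpart of Eq. 17), assumes $\sigma = k_1 / f_0$, and estimates the phase and log-amplitude derivatives by central differences over two samples. The helper names, the three-$\sigma$ truncation, and the test signals are illustrative, not taken from the paper; the threshold 1.25 is the value used in the experiments.

```python
import numpy as np

def gabor_response(signal, f0, k1=np.pi, n_sigma=3.0):
    """Convolve a 1D signal with a Gabor filter tuned to f0 (sigma = k1/f0 assumed)."""
    sigma = k1 / f0
    half = int(np.ceil(n_sigma * sigma))
    t = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-t**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)
    g = g * np.exp(1j * f0 * t)
    return np.convolve(signal, g, mode="same"), sigma, half

def stability_criterion(r, f0, sigma):
    """Assumed form T' = sigma * sqrt(T1^2 + T3^2) from a complex response r."""
    # Phase derivative from the angle between samples two pixels apart.
    dphi = np.angle(r[2:] * np.conj(r[:-2])) / 2.0
    # Log-amplitude derivative by central differences (tiny offset avoids log(0)).
    loga = np.log(np.abs(r) + 1e-300)
    dloga = (loga[2:] - loga[:-2]) / 2.0
    t = np.full(r.shape, np.inf)          # end samples have no estimate
    t[1:-1] = sigma * np.sqrt((dphi - f0) ** 2 + dloga**2)
    return t

if __name__ == "__main__":
    x = np.arange(400, dtype=float)
    f0, threshold = 0.55, 1.25
    for name, sig in (("single sinusoid", np.cos(f0 * x)),
                      ("two sinusoids", np.cos(0.5 * x) + np.cos(0.6 * x))):
        r, sigma, half = gabor_response(sig, f0)
        t = stability_criterion(r, f0, sigma)[half + 1:len(x) - half - 1]
        print(f"{name}: max T' = {t.max():.3f}, unstable fraction = {np.mean(t > threshold):.3f}")
```

In this sketch a single sinusoid stays far below the threshold, while the beating pair is flagged only near the positions where the two components cancel.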


Figure 4: Stability Criterion

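For the 2D Gabor filters proposed previously, let us assume that the X axis
coincides with the radial orientation $\mathbf{u}$ of a filter. The four criteria can
then be expressed as:

\[ \mathbf{T}_1 = \nabla_r \phi - \mathbf{u} = \frac{\lambda(\lambda + \cos\Delta\phi)}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}\begin{pmatrix}\Delta u \\ \Delta v\end{pmatrix}, \tag{13} \]

\[ \mathbf{T}_2 = u\,\nabla_u \phi = \frac{\lambda u \sigma^2 \sin\Delta\phi}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}\begin{pmatrix}\Delta u \\ [\ldots]\,\Delta v\end{pmatrix}, \tag{14} \]

\[ \mathbf{T}_3 = \frac{1}{a}\nabla_r a = \frac{\lambda \sin\Delta\phi}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}\begin{pmatrix}\Delta u \\ \Delta v\end{pmatrix}, \tag{15} \]

\[ \mathbf{T}_4 = \frac{u}{a}\nabla_u a = \frac{\lambda u \sigma^2 (\lambda + \cos\Delta\phi)}{\lambda^2 + 2\lambda\cos\Delta\phi + 1}\begin{pmatrix}\Delta u \\ [\ldots]\,\Delta v\end{pmatrix}, \tag{16} \]

where $\nabla_r$ is the gradient operator in the spatial domain, $\nabla_u$ is the gradient
operator in the spatial frequency domain, $u$ is the peak tuning frequency, $\Delta u$
and $\Delta v$ are the frequency differences in the X and Y directions, and $k_2$ is the
aspect ratio of the filter. Similarly, the overall stability criterion can be expressed
as:

\[ \|\mathbf{T}'\|^2 = \sigma^2\left(\|S\,\mathbf{T}_1\|^2 + \|S\,\mathbf{T}_3\|^2\right), \tag{17} \]

where $S$ is the aspect scaling matrix.

4.2 Noise Contamination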


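Because the filters have different window sizes, the effects of noise on the
amplitude and phase values also differ. Assuming additive white noise, the noise
component included in the filter output can be expressed as:

\[ N(f_0) = \int_{-\infty}^{\infty} n\, e^{j\phi_n}\, e^{-\frac{\sigma^2 (f - f_0)^2}{2}}\, df = \frac{\sqrt{2\pi}}{\sigma}\int_{-\infty}^{\infty} g\!\left(f - f_0; \tfrac{1}{\sigma}\right) n\, e^{j\phi_n}\, df, \tag{18} \]

where $g(f - f_0; 1/\sigma)$ is a Gaussian function centered at $f_0$ with extent
$1/\sigma$, $n$ is the magnitude of the noise at every frequency, and $\phi_n$ is the
random phase of the noise.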

Because we assumed white noise, the expected value of the rightmost integral in
Eq. 18 is assumed to be a complex number with constant magnitude and random phase.
Therefore, using Eq. 2, we obtain the relative noise level in every frequency band:

\[ \|N(f_0)\| = k_3 f_0, \tag{19} \]

or, in the 2D case,

\[ \|N(\mathbf{f}_0)\| = k_3 \|\mathbf{f}_0\|^2, \tag{20} \]

where $k_3$ is a constant.


5  Applications 

The general framework developed in this paper is applicable to any vision task that
makes use of either amplitude or phase information. We explain two specific
applications, focus quality measurement and 2D correspondence. We chose these two
applications because, as we will see, focus quality measurement makes use of amplitude
information only, and 2D correspondence makes use of phase information only.


5.1  Focus  Quality  Measurement 

As explained in [22, 17], the key problem in focus quality measurement can be stated
as follows: given two images that are blurred to different extents, how do we measure
locally the difference in blurring at each pixel location? Modeling the blurring as a
convolution with a Gaussian, we have [17]

\[ \sigma' = \frac{1}{f}\sqrt{\left|\ln\|I_0(f)\|^2 - \ln\|I_1(f)\|^2\right|}, \tag{21} \]

where $\sigma'$ represents the blurring difference, and $\|I_0(f)\|$ and $\|I_1(f)\|$ are
the amplitude values of the two images at frequency $f$.
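For a single frequency band, the blurring difference follows directly from the two amplitudes. The minimal Python sketch below assumes the Gaussian blur attenuates the amplitude at angular frequency $f$ by $e^{-f^2\sigma'^2/2}$, which is what the squared-amplitude logarithms above imply; the function name and sample values are illustrative.

```python
import numpy as np

def blur_difference(amp0, amp1, f):
    """Blurring difference sigma' from two amplitudes at angular frequency f."""
    return np.sqrt(np.abs(np.log(amp0**2) - np.log(amp1**2))) / f

if __name__ == "__main__":
    f, sigma_true, amp0 = 0.7, 1.3, 2.0
    # Synthetic blurred amplitude under the Gaussian blur model.
    amp1 = amp0 * np.exp(-(f**2) * sigma_true**2 / 2.0)
    print("recovered sigma':", blur_difference(amp0, amp1, f))
```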

We can apply the set of 2D variable window Gabor filters to extract amplitude
information at each frequency band, eliminate unstable amplitude information by
thresholding the criterion in Eq. 17, and then fit all remaining stable amplitude
values, together with their noise levels (Eq. 20), to the following equation, which is
linear with respect to $\|\mathbf{f}\|^2$ [21]:

\[ \sigma'^2 \|\mathbf{f}\|^2 = \ln\|I_0(\mathbf{f})\|^2 - \ln\|I_1(\mathbf{f})\|^2 + c, \tag{22} \]

where the constant $c$ compensates for the illumination difference between the two
images. Instead of the peak tuning frequency, $\|\mathbf{f}\|^2$ can be better
approximated by the average of the two instantaneous frequencies,
$\|\tfrac{1}{2}(\nabla_r\phi_0 + \nabla_r\phi_1)\|^2$. Ordinary estimation can be applied
to the fitting, with the uncertainty of the right side expressed as [21]

\[ \frac{\|N(\mathbf{f})\|}{\|I_0(\mathbf{f})\|} + \frac{\|N(\mathbf{f})\|}{\|I_1(\mathbf{f})\|}. \tag{23} \]
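The line fit across frequency bands can be sketched with a weighted least-squares fit. The Python example below uses `numpy.polyfit`, whose `w` argument expects weights of the form 1/uncertainty; the per-band uncertainty is taken as given, and the function name, the clipping of negative slopes, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def fit_blur(f_squared, log_amp_diff, uncertainty):
    """Fit log_amp_diff = sigma'^2 * f_squared - c over stable bands.

    f_squared    : squared radial frequency of each stable band
    log_amp_diff : ln||I0||^2 - ln||I1||^2 for each band
    uncertainty  : per-band uncertainty of log_amp_diff
    Returns (sigma', c).
    """
    slope, intercept = np.polyfit(f_squared, log_amp_diff, 1, w=1.0 / uncertainty)
    # A negative slope would be unphysical; clip it to zero in this sketch.
    return np.sqrt(max(slope, 0.0)), -intercept

if __name__ == "__main__":
    f_sq = np.array([0.04, 0.16, 0.36, 0.64, 1.0])
    sigma_true, c_true = 1.0, 0.1
    y = sigma_true**2 * f_sq - c_true
    print(fit_blur(f_sq, y, np.ones_like(f_sq)))
```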

The iterative estimation in [22] can also be implemented efficiently as convolutions
in the spatial frequency domain:

\[ I(x) \otimes g(x; \sigma') \otimes f(x) = \left(I(x) \otimes f(x)\right) \otimes g(x; \sigma'), \tag{24} \]

where $f(x)$ is a bandpass filter.
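The identity rests on the commutativity and associativity of convolution, so bandpass responses can be computed once and then blurred repeatedly. A small Python check with `numpy.convolve` on a 1D signal is sketched below; the kernels and their parameters are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(sigma, n_sigma=3.0):
    """Normalized Gaussian blur kernel truncated at n_sigma * sigma."""
    half = int(np.ceil(n_sigma * sigma))
    t = np.arange(-half, half + 1, dtype=float)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    return k / k.sum()

def bandpass_kernel(f0, k1=np.pi, n_sigma=3.0):
    """Real part of a Gabor filter tuned to f0 (sigma = k1 / f0 assumed)."""
    sigma = k1 / f0
    half = int(np.ceil(n_sigma * sigma))
    t = np.arange(-half, half + 1, dtype=float)
    return np.exp(-t**2 / (2.0 * sigma**2)) * np.cos(f0 * t)

def blur_then_filter(signal, blur, bandpass):
    return np.convolve(np.convolve(signal, blur), bandpass)

def filter_then_blur(signal, blur, bandpass):
    return np.convolve(np.convolve(signal, bandpass), blur)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    row = rng.standard_normal(64)
    a = blur_then_filter(row, gaussian_kernel(1.5), bandpass_kernel(0.8))
    b = filter_then_blur(row, gaussian_kernel(1.5), bandpass_kernel(0.8))
    print("orders agree:", np.allclose(a, b))
```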


5.2  2D  Correspondence 

The  key  problem  in  2D  correspondence  is  to  find  the  spatial  shift  between  two  images 
at  each  pixel  location.  Because  a  spatial  shift  is  equivalent  to  a  phase  shift  in  frequency 
domain,  we  can  infer  the  spatial  shift  from  phase  difference  [18,  7,  15]. 

An obvious approach, similar to that of focus quality measurement, is to apply the
set of 2D variable window Gabor filters to extract phase information at each frequency
band, eliminate unstable phase information by thresholding the criterion in Eq. 17, and
then minimize the following expression to find the disparity $\mathbf{r}$:

\[ \sum_{\text{stable}} \frac{\left((\phi_1 - \phi_0 - \mathbf{f}\cdot\mathbf{r}) \bmod 2\pi\right)^2}{\left(\dfrac{\|N(\mathbf{f})\|}{\|I_0(\mathbf{f})\|} + \dfrac{\|N(\mathbf{f})\|}{\|I_1(\mathbf{f})\|}\right)^2}, \tag{25} \]

where $N(\mathbf{f})$ is the noise component in Eq. 20.

The iterative estimation in [7] can be done by simply shifting one image locally
according to the previously estimated disparities.
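One step of such a disparity estimate can be sketched as a weighted linear least-squares problem. The Python example below assumes the true disparity is small enough that $|\mathbf{f}\cdot\mathbf{r}| < \pi$ for every band, so wrapping the measured phase difference into $(-\pi, \pi]$ removes the modulo ambiguity; it is a single non-iterative step, and the function names and the optional weights are illustrative.

```python
import numpy as np

def wrap_phase(p):
    """Wrap phase values into the interval [-pi, pi)."""
    return (p + np.pi) % (2.0 * np.pi) - np.pi

def estimate_disparity(freqs, phase_diff, weights=None):
    """Least-squares disparity r minimizing sum_i w_i * (dphi_i - f_i . r)^2.

    freqs      : (n, 2) array of peak frequencies (u, v) of the stable bands
    phase_diff : (n,) array of measured phase differences phi_1 - phi_0
    weights    : optional (n,) array of weights, e.g. 1 / uncertainty^2
    ASSUMPTION: |f . r| < pi for every band, so no phase unwrapping is needed.
    """
    freqs = np.asarray(freqs, dtype=float)
    d = wrap_phase(np.asarray(phase_diff, dtype=float))
    w = np.ones(len(d)) if weights is None else np.sqrt(np.asarray(weights, dtype=float))
    r, *_ = np.linalg.lstsq(freqs * w[:, None], d * w, rcond=None)
    return r

if __name__ == "__main__":
    freqs = np.array([[0.5, 0.0], [0.0, 0.5], [0.7, 0.7], [1.0, -0.4]])
    r_true = np.array([0.3, -0.2])
    print(estimate_disparity(freqs, freqs @ r_true))
```

Because the phase differences are real-valued, the estimate is not restricted to integer pixel shifts.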


6  Experimental  Results 

6.1  Elimination  of  Unstable  Information 

First, we artificially convolved an image with a Gaussian function
$g(x; \sigma = 1.0)$, then convolved the set of filters with the original and blurred
images, and analyzed the relations between the amplitudes of various frequencies at an
arbitrarily chosen pixel location. The left half of Figure 5 shows the relative error
of estimating $\sigma'$ (Eq. 22) from a single frequency band without eliminating
unstable amplitude information³, and the right half shows the error after thresholding
by $T' = 1.25$ in Eq. 17.

Figure 5: Elimination of Unstable Amplitude Information (left: before elimination;
right: after elimination; vertical axis in units of 100%)

To illustrate the effectiveness of eliminating unstable phase information, we
spatially shifted an image by $\mathbf{r}$; the phase difference then represents the
error of the phase with respect to ideal sinusoids:

\[ \psi = (\phi_1 - \phi_2) \bmod 2\pi, \tag{26} \]

where $\phi_1$ and $\phi_2$ are the two phase values resulting from convolving a filter
with the original and shifted images. Ideally, $\psi$ should be zero at every frequency
band. At an arbitrarily chosen pixel location, the left half of Figure 6 shows $\psi$
for every frequency band without eliminating unstable phase information, and the right
half shows $\psi$ after thresholding by $T' = 1.25$ in Eq. 17.

As illustrated in Figure 5 and Figure 6, the identification of unstable information
is indeed very accurate. Depending on the application, the threshold can be changed to
impose a looser or tighter requirement on how closely the output of a bandpass filter
must resemble a sinusoid.

As Figure 5 and Figure 6 also illustrate, there is a strong correlation between
instability and amplitude values. This suggests an amplitude thresholding scheme [21],
which assumes that unstable amplitude and phase information is caused by low amplitude
values. Even though this amplitude thresholding scheme can indeed work well in some
cases, when we compared it with stability thresholding we found that it usually
requires images to have strong high frequency components, because otherwise weak high
frequency components always fall below the threshold even though they are stable with
respect to the stability criterion. On the other hand, if we lower the amplitude
threshold, a large amount of unstable information gets through. In the experiments
described below, the amplitude threshold is set to 0.06 (refer to Figure 5 and
Figure 6), assuming images are normalized so that their DC components have an amplitude
value of 1.0.

³In this case $c$ in Eq. 22 is zero.

Figure 6: Elimination of Unstable Phase Information (left: before elimination; right:
after elimination)

Figure 7 shows two images of two textured surfaces at different depths. The images
were taken with different lens apertures, which makes the difference in focus quality a
function of depth [21]. We applied the set of filters to the two images, eliminated
unstable amplitude information by thresholding the stability criterion, and fitted a
line according to Eq. 22 to obtain $\sigma'$ at each pixel location. Figure 8 shows the
difference in focus quality in the rectangular region marked in Figure 7. As we can
see, the two surfaces are clearly separated, and the depth discontinuity is well
located. Figure 9 shows the focus quality difference computed with amplitude
thresholding. The areas with strong high frequency components are recovered correctly,
while in other areas the results show instability.

Figure 10 shows two images of a rotating ball. We applied the filtering and
stability analysis techniques to the two images, and used Eq. 25 to find 2D
disparities. Figure 11 shows the 2D correspondence between the two images. Owing to the
nature of the algorithm, subpixel accuracy is achieved without any interpolation. Given
sparse features such as those on the ball, the algorithm automatically avoids the
unstable information obtained in featureless areas. Figure 12 shows the 2D
correspondence using the amplitude threshold. Around the boundary of the ball, where
the contrast is lower, the algorithm generates unstable results.

Figure 7: Two Images Taken Under Different Lens Apertures

Figure 8: Difference of Focus Quality Using Stability Threshold

Figure 9: Difference of Focus Quality Using Amplitude Threshold

6.2  Fixed  Window  vs.  Variable  Window 

The selection of a window size for any window operation is a compromise between
resolution and stability. Large windows reduce the effects of noise and make results
more stable, but they also reduce resolution. Small windows, on the other hand,
preserve locality, i.e. high resolution, but are potentially unstable. A proper window
size therefore has to be chosen ad hoc for each specific problem. An implicit rule for
choosing a window size is to select the smallest window for which the error caused by
noise and windowing is still within an acceptable range.

An alternative to selecting a window size every time new images are processed is to
adopt the variable window scheme proposed in this paper. Because window sizes are
proportional to wavelengths, this scheme uses large windows when frequencies are low
and small windows when frequencies are high.

Figure 13 and Figure 14 show examples of input images with different content. The
images in Figure 13 contain a considerable amount of high frequency information, while
the images in Figure 14 contain only low frequency information. The black lines in the
images mark the locations where focus quality differences are measured. Focus quality
was measured using the fixed window scheme [22] with a large window ($\sigma = 20.0$)
and a small window ($\sigma = 10.0$), and using the variable window scheme described in
the previous sections.

Figure 15 shows the focus quality difference along a single scan line between the
images in Figure 13. The large window scheme resulted in a large slope around the depth
discontinuity. The small window scheme produced noisier results before and after the
discontinuity, even though it located the discontinuity more precisely. Figure 16 shows
the focus quality difference between the images in Figure 14. This time, the small
window scheme generated unstable results, and the large window scheme produced blurred
but stable results. In both Figure 15 and Figure 16, the variable window scheme
resulted in a meaningful compromise between resolution and precision. Table 1 shows the
root mean square errors of all cases. The variable window scheme performs significantly
better than the large window scheme on the nearly focused image pair, and than the
small window scheme on the defocused image pair.

Figure 10: Two Images of a Ball

Figure 11: 2D Correspondence Using Stability Threshold

Figure 12: 2D Correspondence Using Amplitude Threshold

Figure 13: Nearly Focused Image Pair

Figure 14: Defocused Image Pair

                      Nearly Focused Image Pair   Defocused Image Pair
Small Fixed Window    0.1218                      0.2708
Large Fixed Window    0.1677                      0.1947
Variable Window       0.1150                      0.1664

Table 1: RMS Errors of Different Window Schemes
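To put the RMS errors of Table 1 in relative terms, the following small Python snippet computes how much lower the variable window error is than each fixed window error. The dictionary layout and function name are illustrative; the numbers are those reported in Table 1.

```python
# RMS errors as reported in Table 1.
RMS = {
    "small fixed": {"nearly focused": 0.1218, "defocused": 0.2708},
    "large fixed": {"nearly focused": 0.1677, "defocused": 0.1947},
    "variable":    {"nearly focused": 0.1150, "defocused": 0.1664},
}

def relative_reduction(baseline, value):
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline

if __name__ == "__main__":
    for scheme in ("small fixed", "large fixed"):
        for pair in ("nearly focused", "defocused"):
            gain = relative_reduction(RMS[scheme][pair], RMS["variable"][pair])
            print(f"variable vs {scheme}, {pair}: {100.0 * gain:.1f}% lower RMS error")
```

Under these numbers the variable window scheme is roughly 31% below the large fixed window on the nearly focused pair and roughly 39% below the small fixed window on the defocused pair.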

7  Conclusion 

This paper provides a general framework for visual computing in the spatial frequency
domain. We addressed two main problems in spatial frequency analysis: the window size
problem and the stability problem associated with windowing. We compared the variable
window scheme with the fixed window scheme, and introduced a more general stability
criterion. We also showed the effectiveness of these two tools in applications to focus
quality measurement and 2D correspondence.

Figure 15: Focus Quality Difference of the Nearly Focused Images

Figure 16: Focus Quality Difference of the Defocused Images
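A Stability Criteria for an Arbitrary Signal

Let us assume an arbitrary signal $i(x)$ with Fourier transform

\[ I(f) = \int_{-\infty}^{\infty} i(x)\, e^{-jfx}\, dx. \tag{27} \]

Its Gabor transform at any location $x_0$ and any peak frequency $f_0$ can be
represented as:

\begin{align*}
G(x_0, f_0) &= \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} i(x - x_0)\, e^{-\frac{x^2}{2\sigma^2}}\, e^{-jf_0 x}\, dx \\
            &= \frac{1}{2\pi}\int_{-\infty}^{\infty} I(f)\, e^{-\frac{(f - f_0)^2\sigma^2}{2}}\, e^{-jfx_0}\, df \\
            &= a(x_0, f_0)\, e^{j\phi(x_0, f_0)} \\
            &= R_0(x_0, f_0) + jI_0(x_0, f_0).
\end{align*}

Then, we have

\[ T_1 = \frac{d\phi}{dx_0} + f_0 = \frac{R_0\frac{dI_0}{dx_0} - I_0\frac{dR_0}{dx_0}}{R_0^2 + I_0^2} + f_0, \tag{28} \]

\[ T_3 = \frac{1}{a}\frac{da}{dx_0} = \frac{R_0\frac{dR_0}{dx_0} + I_0\frac{dI_0}{dx_0}}{R_0^2 + I_0^2}. \tag{29} \]

Similarly,

\[ T_2 = f_0\frac{d\phi}{df_0} = f_0\,\frac{R_0\frac{dI_0}{df_0} - I_0\frac{dR_0}{df_0}}{R_0^2 + I_0^2}, \tag{30} \]

\[ T_4 = \frac{f_0}{a}\frac{da}{df_0} = f_0\,\frac{R_0\frac{dR_0}{df_0} + I_0\frac{dI_0}{df_0}}{R_0^2 + I_0^2}. \tag{31} \]

Let us define $G_1(x_0, f_0)$:

\begin{align}
G_1(x_0, f_0) &= \frac{j}{\sqrt{2\pi}\,\sigma^3}\int_{-\infty}^{\infty} i(x - x_0)\, x\, e^{-\frac{x^2}{2\sigma^2}}\, e^{-jf_0 x}\, dx \tag{32} \\
              &= \frac{1}{2\pi}\int_{-\infty}^{\infty} I(f)\,(f_0 - f)\, e^{-\frac{(f - f_0)^2\sigma^2}{2}}\, e^{-jfx_0}\, df \tag{33} \\
              &= R_1 + jI_1, \notag
\end{align}

where the equivalence between Eq. 33 and Eq. 32 can be easily verified by using the
following property of the Fourier transform:

\[ \mathcal{F}\{x\,h(x)\} = j\,\frac{d}{df}\,\mathcal{F}\{h(x)\}, \tag{34} \]

where $\mathcal{F}$ denotes the Fourier transform operator.

Now, we have

\begin{align}
\frac{dG}{dx_0} = \frac{dR_0}{dx_0} + j\frac{dI_0}{dx_0}
  &= -\frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} i'(x - x_0)\, e^{-\frac{x^2}{2\sigma^2}}\, e^{-jf_0 x}\, dx \tag{35} \\
  &= \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} i(x - x_0)\left(-jf_0 - \frac{x}{\sigma^2}\right) e^{-\frac{x^2}{2\sigma^2}}\, e^{-jf_0 x}\, dx \tag{36} \\
  &= j\left(G_1(x_0, f_0) - f_0\, G(x_0, f_0)\right), \tag{37}
\end{align}

where, from step 35 to step 36, we used integration by parts:

\[ \int u'(x)\,v(x)\,dx = u(x)\,v(x) - \int u(x)\,v'(x)\,dx. \]

Using the same technique, we have

\begin{align}
\frac{dG}{df_0} = \frac{dR_0}{df_0} + j\frac{dI_0}{df_0}
  &= -\frac{j}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty} i(x - x_0)\, x\, e^{-\frac{x^2}{2\sigma^2}}\, e^{-jf_0 x}\, dx \notag \\
  &= -\sigma^2 G_1(x_0, f_0). \tag{38}
\end{align}

Substituting Eq. 37 and Eq. 38 into Eqs. 28, 29, 30 and 31, we obtain

\[ T_1 = \frac{d\phi}{dx_0} + f_0 = \frac{R_0 R_1 + I_0 I_1}{R_0^2 + I_0^2}, \tag{39} \]

\[ T_2 = f_0\frac{d\phi}{df_0} = \frac{\sigma^2 f_0\,(I_0 R_1 - R_0 I_1)}{R_0^2 + I_0^2}, \tag{40} \]

\[ T_3 = \frac{1}{a}\frac{da}{dx_0} = \frac{I_0 R_1 - R_0 I_1}{R_0^2 + I_0^2}, \tag{41} \]

\[ T_4 = \frac{f_0}{a}\frac{da}{df_0} = -\frac{\sigma^2 f_0\,(R_0 R_1 + I_0 I_1)}{R_0^2 + I_0^2}. \tag{42} \]

Therefore, we have

\[ T_2^2 + T_4^2 = \sigma^4 f_0^2\,(T_1^2 + T_3^2) = \sigma^4 f_0^2\,\frac{R_1^2 + I_1^2}{R_0^2 + I_0^2}. \tag{43} \]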
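The two derivative identities used in this appendix, $dG/dx_0 = j(G_1 - f_0 G)$ and $dG/df_0 = -\sigma^2 G_1$, can be checked numerically. The Python sketch below evaluates $G$ and $G_1$ by direct summation on a fine grid for an arbitrary smooth test signal and compares central-difference derivatives with the closed forms. The test signal, the grid parameters and the function name are illustrative assumptions.

```python
import numpy as np

def gabor_pair(signal_fn, x0, f0, sigma, half_width=8.0, n=20001):
    """Numerically evaluate G(x0, f0) and G1(x0, f0) for a signal i(x).

    signal_fn must accept a NumPy array. The integrals are approximated by a
    Riemann sum over [-half_width*sigma, half_width*sigma].
    """
    x = np.linspace(-half_width * sigma, half_width * sigma, n)
    dx = x[1] - x[0]
    window = np.exp(-x**2 / (2.0 * sigma**2)) * np.exp(-1j * f0 * x)
    s = signal_fn(x - x0)
    g = np.sum(s * window) * dx / (np.sqrt(2.0 * np.pi) * sigma)
    g1 = 1j * np.sum(s * x * window) * dx / (np.sqrt(2.0 * np.pi) * sigma**3)
    return g, g1

if __name__ == "__main__":
    def signal(x):
        return np.cos(0.9 * x) + 0.5 * np.sin(1.3 * x + 0.4)

    x0, f0, sigma, h = 0.7, 1.0, 3.0, 1e-4
    g, g1 = gabor_pair(signal, x0, f0, sigma)
    dg_dx0 = (gabor_pair(signal, x0 + h, f0, sigma)[0]
              - gabor_pair(signal, x0 - h, f0, sigma)[0]) / (2.0 * h)
    dg_df0 = (gabor_pair(signal, x0, f0 + h, sigma)[0]
              - gabor_pair(signal, x0, f0 - h, sigma)[0]) / (2.0 * h)
    print("dG/dx0 residual:", abs(dg_dx0 - 1j * (g1 - f0 * g)))
    print("dG/df0 residual:", abs(dg_df0 + sigma**2 * g1))
```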